{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "8d3e3a9c",
   "metadata": {},
   "source": [
    "(create-and-use-functions)=\n",
    "# Create functions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bfe1aea0",
   "metadata": {},
   "source": [
    "Functions are the basic building blocks of MLRun: they are essentially Python objects that know how to run locally or on a Kubernetes cluster. This section covers how to create and customize an MLRun function, as well as common parameters across all functions."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d68f026",
   "metadata": {},
   "source": [
    "**In this section**\n",
    "- [Naming functions: best practice](#naming-functions---best-practice)\n",
    "- [Building images](#building-images)\n",
    "- [Creating functions](#creating-functions)\n",
    "- [Create a function with a single source file](#create-a-function-with-a-single-source-file)\n",
    "- [Create a function with multiple source files](#create-a-function-with-multiple-source-files)\n",
    "- [Import or use an existing function](#import-or-use-an-existing-function)\n",
    "- [Set requirements when creating a function](#set-requirements-when-creating-a-function)\n",
    "- [Customizing functions](#customizing-functions)\n",
    "\n",
    "**See also**\n",
    "{ref}`workflows` for more details and examples about running functions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a9531501",
   "metadata": {},
   "source": [
    "## Naming functions - best practice\n",
    "\n",
    "When you create a function, you specify its name. When you deploy a function using the SDK, MLRun appends the project name to the function name. Project names are limited to 63 characters. You must ensure that the combined function-project name does not exceed 63 characters. A function created in the UI has a default limit of 56 characters. MLRun adds to nuclio functions, by default, \"nuclio-\", giving a total of 63 characters."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c9005e5",
   "metadata": {
    "vscode": {
     "languageId": "plaintext"
    }
   },
   "source": [
    "<a id=\"building-images\"></a>\n",
    "## Building images\n",
    "If your MLRun function requires additional libraries or files, you might need to build a new Docker image. You can do this by specifying a base image to use as the `image`, your requirements via `requirements`, and (optionally) your source code via `with_repo=True` (where the source is specified by `project.set_source(...)`). See the examples in [Set requirements when creating a function](#set-requirements-when-creating-a-function)\n",
    "See more details about images in [MLRun images](./images.md#mlrun-runtime-images) and more information on when a build is required in [Build function image](./image-build.md).\n",
    "\n",
    "``` {admonition} Note\n",
    "When using `with_repo`, the contents of the Git repo or archive are available in the current working directory of your MLRun function during runtime.\n",
    "```\n",
    "\n",
    "A good place to start is one of the default MLRun images:\n",
    "- `mlrun/mlrun`: An MLRun image includes preinstalled OpenMPI and other ML packages. Useful as a base image for simple jobs.\n",
    "- `mlrun/mlrun-gpu`: The same as `mlrun/mlrun` but for GPUs, including Open MPI.\n",
    "\n",
    "Dockerfiles for the MLRun images can be found [**here**](https://github.com/mlrun/mlrun/tree/development/dockerfiles)."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cd31f4a3",
   "metadata": {},
   "source": [
    "<a id=\"creating-functions\"></a>\n",
    "## Creating functions"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bada099b",
   "metadata": {},
   "source": [
    "When creating a function, there are three main scenarios:\n",
    "- [**Single source file**](#create-a-function-with-a-single-source-file) &mdash; when your code can be contained in a single file\n",
    "- [**Multiple source files**](#create-a-function-with-multiple-source-files) &mdash; when your code requires additional files or dependencies\n",
    "- [**Import an existing function**](#import-or-use-an-existing-function) &mdash; when your function already exists elsewhere and you just want to import it\n",
    "\n",
    "\n",
    "The MLRun project object has a method called {py:meth}`~mlrun.projects.MlrunProject.set_function`, which is a one-size-fits-all way of creating an MLRun function. This method accepts a variety of sources including Python files, Jupyter Notebooks, Git repos, and more. Its usage has many variations, as you can see in the following code snippets. The return value of `set_function` is your MLRun function. You can immediately run it or apply additional configurations like resources, scaling, etc. See [Customizing functions](#customizing-functions) for more details. Depending on the source passed in, the project registers the function using some lower level functions. For specific use cases, you also have access to the lower level functions {py:func}`~mlrun.run.new_function`, {py:func}`~mlrun.run.code_to_function`, and {py:func}`~mlrun.run.import_function`.\n",
    "\n",
    "Functions should be created within a project (see [**create and use projects**](../projects/create-project.md)).\n",
    "The initial steps to create a function usually look like:\n",
    "\n",
    "```python\n",
    "project = mlrun.get_or_create_project(...)\n",
    "\n",
    "fn = project.set_function(...)\n",
    "```\n",
    "The examples in this page assume you are working within a project."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bb9cebf9",
   "metadata": {},
   "source": [
    "The `set_function` parameters, used across all function types and creation scenarios, are: \n",
    "\n",
    "- **name:** Name of your MLRun function within the given project. This is displayed in the MLRun UI together with the Kubernetes pod name.\n",
    "\n",
    "- **tag:** Tag for your function (much like a Docker image). Omitting this parameter defaults to `latest`.\n",
    "\n",
    "- **func:** What to run with the MLRun function. This can be a number of things including files (`.py`, `.ipynb`, `.yaml`, etc.), URIs (`hub://` prefixed Hub URI, `db://` prefixed MLRun DB URI), existing MLRun function objects, or `None` (for current `.ipynb` file).\n",
    "\n",
    "- **image:** Docker image to use when containerizing the piece of code. If you also specify the `requirements` parameter to build a new Docker image, the `image` parameter is used as the base image.\n",
    "\n",
    "- **kind:** Runtime the MLRun function uses. See [Kinds of functions (runtimes)](../concepts/functions-overview.md) for the list of supported batch and real-time runtimes.\n",
    "\n",
    "- **handler:** Default function handler to invoke (e.g. a Python function within your code). This handler can also be overridden when executing the function. For example, `project.run_function(\"func1\", handler=\"f2\")` executes the `f2` Python function, even if the default handler is `f1`.\n",
    "\n",
    "- **requirements:** Additional Python dependencies needed for the function to run. Using this parameter results in a new Docker image (using the `image` parameter as a base image). This can be a list of Python dependencies or a path to a `requirements.txt` file. See examples in [Set requirements when creating a function](#set-requirements-when-creating-a-function).\n",
    "\n",
    "- **with_repo:** Set to `True` if the function requires additional files or dependencies from a Git repo or archive file. This Git repo or archive file is specified on a project level via `project.set_source(...)`, which the function consumes. If this parameter is omitted, the default is `False`.\n",
    "\n",
    "<a id=\"creating-functions-reqs\"></a>\n",
    "### Set requirements when creating a function\n",
    "\n",
    "You can list the Python dependencies, for example:\n",
    "```python\n",
    "# By providing a list of packages\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", tag=\"latest\", func=\"./src/my-code.py\",\n",
    "    image=\"mlrun/mlrun\", kind=\"job\", handler=\"train_model\",\n",
    "    requirements=[\"requests\", \"pandas==1.3.5\"], with_repo=True\n",
    ")\n",
    "```\n",
    "And you can list a path to a requirements.txt file, for example:\n",
    "\n",
    "```python\n",
    "# By providing a path to a pip requirements file\n",
    "proj.set_function(\"my-func\", requirements=\"requirements.txt\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8e827478",
   "metadata": {},
   "source": [
    "<a id=\"single-source-file\"></a>\n",
    "## Create a function with a single source file"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "be147543",
   "metadata": {},
   "source": [
    "The simplest way to create a function is to use a single file as the source. The code itself is embedded into the MLRun function object. This makes the function quite portable since it does not depend on any external files. You can use any source file supported by MLRun such as Python or Jupyter notebook. \n",
    "Files of type Bash, Go, etc. are also supported."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "41f62728",
   "metadata": {},
   "source": [
    "### Create a function with a Python file"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bf09c489",
   "metadata": {},
   "source": [
    "This is the simplest way to create a function out of a given piece of code. Simply pass in the path to the Python file *relative to your project context directory*."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "af00fedd",
   "metadata": {},
   "source": [
    "```python\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", func=\"./src/my-code.py\", kind=\"job\",\n",
    "    image=\"mlrun/mlrun\", handler=\"handler\"\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5ab01939",
   "metadata": {},
   "source": [
    "### Create a function with Jupyter Notebook"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d5c252fd",
   "metadata": {},
   "source": [
    "This is a great way to create a function out of a Jupyter Notebook. Just pass in the path to the Jupyter Notebook  *relative to your project context directory*. You can use [**MLRun cell tags**](./mlrun_code_annotations.ipynb) to specify which parts of the notebook should be included in the function. \n",
    "\n",
    "``` {admonition} Note\n",
    "To ensure that the latest changes are included, make sure you save your notebook before creating/updating the function.\n",
    "```\n",
    "\n",
    "```python\n",
    "proj.set_function(\"http://.../my-nb.ipynb\", \"my-func\", kind=job image=\"mlrun/mlrun\" )\n",
    "```"
   ]
  },
  {
   "cell_type": "raw",
   "id": "5ec5c088",
   "metadata": {
    "vscode": {
     "languageId": "raw"
    }
   },
   "source": [
    "```python\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", func=\"./src/my-code.ipynb\", kind=\"serving\", \n",
    "    image=\"mlrun/mlrun\", requirements=[\"pandas==1.3.5\"]\n",
    ")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6b36faf7",
   "metadata": {},
   "source": [
    "You can also create an MLRun function out of the current Jupyter Notebook you are running in. To do this, simply ommit the `func` parameter in `set_function`. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0204fd5d",
   "metadata": {},
   "source": [
    "<a id=\"multiple-source-files\"></a>\n",
    "## Create a function with multiple source files"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4f32f1ea",
   "metadata": {},
   "source": [
    "If your code requires additional files or external libraries, you need to use a source that supports multiple files such as Git, an archive (zip/tar), or V3IO file share. This approach (especially using a Git repo) pairs well with MLRun projects.\n",
    "\n",
    "To do this, you must:\n",
    "- Provide `with_repo=True` when creating your function via `project.set_function(...)`\n",
    "- Set the project source via `project.set_source(source=...)`\n",
    "\n",
    "This instructs MLRun to load source code from the git repo/archive/file share associated with the project. There are two ways to load these additional files:\n",
    "\n",
    "### Load code from container\n",
    "In this scenario, the function is built once. **This is the preferred approach for production workloads**. Here are a few examples:\n",
    "\n",
    "Example of a python function and handler:\n",
    "```python\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\")\n",
    "\n",
    "fn = proj.set_function(\n",
    "    func=\"./src/my-code.py\", name=\"my-func\", \n",
    "    image=\"mlrun/mlrun\", with_repo=True, handler=\"my_handler\"\n",
    ")\n",
    "```\n",
    "\n",
    "Example of a python function contained in the handler:\n",
    "```python\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\")\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", handler=\"job_func.job_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"job\", with_repo=True,\n",
    ")\n",
    "```\n",
    "\n",
    "When the py function is inside a folder, the `handler` looks like:\n",
    "```python\n",
    "project.set_source(source=\"git://github.com/mlrun/project-archive.git\")\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", handler=\"folder1.job_func.job_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"job\", with_repo=True,\n",
    ")\n",
    "```\n",
    "\n",
    "\n",
    "### Load code at runtime\n",
    "The function pulls the source code at runtime. **This is a simpler approach during development that allows for making code changes without re-building the image each time**. For example:\n",
    "\n",
    "```python\n",
    "archive_url = \"https://s3.us-east-1.wasabisys.com/iguazio/project-archive/project-archive.zip\"\n",
    "project.set_source(source=archive_url, pull_at_runtime=True)\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", handler=\"nuclio_func:nuclio_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"nuclio\", with_repo=True,\n",
    ")\n",
    "```\n",
    "### Load code at runtime using a non-default source\n",
    "If your project already has a default source, you can still use a different source at runtime or build of a function. \n",
    "Based on the previous example:\n",
    "\n",
    "```python\n",
    "archive_url = \"https://s3.us-east-1.wasabisys.com/iguazio/project-archive/project-archive.zip\"\n",
    "project.set_source(source=archive_url, pull_at_runtime=True)\n",
    "\n",
    "fn = project.set_function(\n",
    "    name=\"my-func\", handler=\"nuclio_func:nuclio_handler\",\n",
    "    image=\"mlrun/mlrun\", kind=\"nuclio\", with_repo=True,\n",
    ")\n",
    "fn.with_source_archive(\"https://s3.us-east-1.wasabisys.com/some/other/archive.zip\")\n",
    "```\n",
    "See {py:meth}`LocalRuntime.with_source_archive <mlrun.runtimes.LocalRuntime.with_source_archive>`, {py:meth}`KubejobRuntime.with_source_archive <mlrun.runtimes.KubejobRuntime.with_source_archive>`, {py:meth}`RemoteRuntime.with_source_archive <mlrun.runtimes.RemoteRuntime.with_source_archive>`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "969cc92e",
   "metadata": {},
   "source": [
    "<a id=\"import-existing-function\"></a>\n",
    "## Import or use an existing function"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c5adbe0",
   "metadata": {},
   "source": [
    "If you already have an MLRun function that you want to import, you can do so from multiple locations such as YAML, Function Hub, and MLRun DB.\n",
    "\n",
    "```{admonition} Note\n",
    "In the UI, running a batch job from an existing function executes the generated spec merged with the function spec. Therefore, if you remove a function spec, for example env vars, it may re-appear in the final job spec.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "36390042",
   "metadata": {},
   "source": [
    "### YAML"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ed7f8c84",
   "metadata": {},
   "source": [
    "MLRun functions can be exported to YAML files via `fn.export()`. These YAML files can then be imported via the following:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7d5f5e26",
   "metadata": {},
   "source": [
    "```python\n",
    "fn = project.set_function(name=\"my-func\", func=\"my-code.yaml\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "94ffa4b2",
   "metadata": {},
   "source": [
    "### Function hub"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8a65c196",
   "metadata": {},
   "source": [
    "Functions can also be imported from the [**MLRun Hub**](https://www.mlrun.org/hub): simply import using the name of the function and the `hub://` prefix:\n",
    "``` {admonition} Note\n",
    "By default, the `hub://` prefix points to the MLRun Hub. You can substitute your own repo. See {ref}`git-repo-as-hub`.\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "89e8a5ad",
   "metadata": {},
   "source": [
    "```python\n",
    "fn = project.set_function(name=\"describe\", func=\"hub://describe\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fd2d5a82",
   "metadata": {},
   "source": [
    "### MLRun DB"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "66397b78",
   "metadata": {},
   "source": [
    "You can also import functions directly from the MLRun DB. These could be functions that have not been pushed to a git repo, archive, or Function Hub. Import via the name of the function and the `db://` prefix:"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a69e3f7f",
   "metadata": {},
   "source": [
    "```python\n",
    "fn = project.set_function(name=\"my-func\", func=\"db://import\")\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "58c92849",
   "metadata": {},
   "source": [
    "### MLRun function"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69c0d1c5",
   "metadata": {},
   "source": [
    "You can also directly use an existing MLRun function object. This is usually used when more granular control over function parameters is required (e.g. advanced parameters that are not supported by {py:meth}`~mlrun.projects.MlrunProject.set_function`).\n",
    "\n",
    "This example uses a {ref}`real-time serving pipeline (graph) <serving-graph>`."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7ceb2812",
   "metadata": {},
   "source": [
    "```python\n",
    "project = mlrun.get_or_create_project(\"my-project\")\n",
    "fn = project.set_function(name=\"my-func\", kind=\"serving\", image=\"mlrun/mlrun\", with_repo=True)\n",
    "graph = serving.set_topology(\"flow\")\n",
    "graph.to(name=\"double\", handler=\"mylib.double\") \\\n",
    "     .to(name=\"add3\", handler=\"mylib.add3\") \\\n",
    "     .to(name=\"echo\", handler=\"mylib.echo\").respond()\n",
    "```"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6467c47",
   "metadata": {},
   "source": [
    "## Customizing functions\n",
    "\n",
    "Once you have created your MLRun function, there are many customizations you can add, including Memory / CPU / GPU resources, \n",
    "preemption mode, pod priority, node selection, and more. See full details in {ref}`configuring-job-resources`."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.13"
  },
  "vscode": {
   "interpreter": {
    "hash": "916dbcbb3f70747c44a77c7bcd40155683ae19c65e1c03b4aa3499c5328201f1"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
